Deeprebirth: a General Approach for Accel- Erating Deep Neural Network Execution on Mobile Devices
نویسندگان
چکیده
Deploying deep neural networks on mobile devices is a challenging task due to computation complexity and memory intensity. Existing works solve this problem by reducing model size using weight compression methods based on dimension reduction (i.e., SVD, Tucker decomposition and Quantization). However, the execution speed of these compressed models are still far below the real-time processing requirement of mobile services. To address this limitation, we propose a novel acceleration framework: DeepRebirth by exploring the deep learning model parameter sparsity through merging the parameter-free layers with their neighbor convolution layers to a single dense layer. The design of DeepRebirth is motivated by the key observation: some layers (i.e., normalization and pooling) in deep learning models actually consume a large portion of computational time even few learned parameters are involved, and acceleration of these layers has the potential to improve the processing speed significantly. Essentially, the functionality of several merged layers is replaced by the new dense layer – rebirth layer in DeepRebirth. In order to preserve the same functionality, the rebirth layer model parameters are re-trained to be functionality equivalent to the original several merged layers. The extensive experiments performed on ImageNet using several popular mobile devices demonstrate that DeepRebirth is not only providing huge speed-up in model deployment and significant memory saving but also maintaining the model accuracy, i.e., 3x-5x speed-up and energy saving on GoogLeNet with only 0.4% accuracy drop on top-5 categorization in ImageNet. Further, by combining with other model compression techniques, DeepRebirth offers an average of 65ms model forwarding time on single image using Samsung Galaxy S6 with only 2.4% accuracy drop. In addition, 2.5x run-time memory saving is achieved with rebirth layers.
منابع مشابه
DeepRebirth: Accelerating Deep Neural Network Execution on Mobile Devices
Deploying deep neural networks on mobile devices is a challenging task. Current model compression methods such as matrix decomposition effectively reduce the deployed model size, but still cannot satisfy real-time processing requirement. This paper first discovers that the major obstacle is the excessive execution time of non-tensor layers such as pooling and normalization without tensor-like t...
متن کاملAn Effective Model for SMS Spam Detection Using Content-based Features and Averaged Neural Network
In recent years, there has been considerable interest among people to use short message service (SMS) as one of the essential and straightforward communications services on mobile devices. The increased popularity of this service also increased the number of mobile devices attacks such as SMS spam messages. SMS spam messages constitute a real problem to mobile subscribers; this worries telecomm...
متن کاملNavigation of a Mobile Robot Using Virtual Potential Field and Artificial Neural Network
Mobile robot navigation is one of the basic problems in robotics. In this paper, a new approach is proposed for autonomous mobile robot navigation in an unknown environment. The proposed approach is based on learning virtual parallel paths that propel the mobile robot toward the track using a multi-layer, feed-forward neural network. For training, a human operator navigates the mobile robot in ...
متن کاملDXTK : Enabling Resource-efficient Deep Learning on Mobile and Embedded Devices with the DeepX Toolkit
Deep learning is having a transformative effect on how sensor data are processed and interpreted. As a result, it is becoming increasingly feasible to build sensor-based computational models that are much more robust to real-world noise and complexity than previously possible. It is paramount that these innovations reach mobile and embedded devices that often rely on understanding and reacting ...
متن کاملMCDNN: An Execution Framework for Deep Neural Networks on Resource-Constrained Devices
Deep Neural Networks (DNNs) have become the computational tool of choice for many applications relevant to mobile devices. However, given their high memory and computational demands, running them on mobile devices has required expert optimization or custom hardware. We present a framework that, given an arbitrary DNN, compiles it down to a resource-efficient variant at modest loss in accuracy. ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016